Cooperative Reinforcement Learning Using an Expert-Measuring Weighted Strategy with WoLF

Author

  • Kevin Cousin
Abstract

Gradient descent learning algorithms have proven effective in solving mixed strategy games. The policy hill climbing (PHC) variants of WoLF (Win or Learn Fast) and PDWoLF (Policy Dynamics based WoLF) have both shown rapid convergence to equilibrium solutions by increasing the accuracy of their gradient parameters over standard Q-learning. Likewise, cooperative learning techniques using weighted strategy sharing (WSS) and expertness measurements improve agent performance when multiple agents are working toward a common goal. By combining these cooperative techniques with fast gradient descent learning, an agent’s performance converges to a solution at an even faster rate. This claim is verified in a stochastic grid world environment using a limited visibility hunter-prey model with random and intelligent prey. Across five different expertness measurements, cooperative learning using each PHC algorithm converges faster than independent learning when agents strictly learn from better-performing agents.
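
As a rough illustration of the cooperative step described in the abstract, the sketch below shows one way weighted strategy sharing could combine agents' Q-tables when each agent learns only from better-performing agents. The abstract does not give the weighting rule, so the rule used here is an assumption: each agent spreads a fixed share of trust over more expert agents in proportion to the expertness gap. The names `expert_weights`, `share_strategies`, and `impressibility` are hypothetical, and the expertness scores are taken as given inputs rather than computed by any of the five measures.

```python
import numpy as np


def expert_weights(expertness, impressibility=0.5):
    """Build W, where W[i, j] is the weight agent i gives to agent j's table.

    Assumed rule (not taken from the abstract): agent i distributes
    `impressibility` over agents with strictly higher expertness, in
    proportion to the expertness gap, and keeps the rest for itself.
    """
    e = np.asarray(expertness, dtype=float)
    n = e.size
    weights = np.zeros((n, n))
    for i in range(n):
        # Gap is positive only for agents that outperform agent i.
        gaps = np.clip(e - e[i], 0.0, None)
        total = gaps.sum()
        if total > 0.0:
            weights[i] = impressibility * gaps / total
            weights[i, i] = 1.0 - impressibility
        else:
            # No better agent exists, so agent i keeps its own table.
            weights[i, i] = 1.0
    return weights


def share_strategies(q_tables, expertness, impressibility=0.5):
    """Return new Q-tables, each a weighted sum of all agents' tables."""
    q = np.asarray(q_tables, dtype=float)  # shape: (agents, states, actions)
    w = expert_weights(expertness, impressibility)
    return np.einsum("ij,jsa->isa", w, q)
```

In this sketch each row of the weight matrix sums to one, so the shared tables stay on the same scale as the originals, and the most expert agent is left unchanged. In a full system this sharing step would be interleaved with each agent's own WoLF-PHC or PDWoLF-PHC updates, which are not shown here.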

Similar resources

Fuzzy State Aggregation and Policy Hill Climbing for Stochastic Environments

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual ag...

Expertness based cooperative Q-learning

By using other agents' experiences and knowledge, a learning agent may learn faster, make fewer mistakes, and create some rules for unseen situations. These benefits would be gained if the learning agent can extract proper rules from the other agents' knowledge for its own requirements. One possible way to do this is to have the learner assign some expertness values (intelligence level values) ...

Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments

Although single-agent deep reinforcement learning has achieved significant success due to the experience replay mechanism, some concerns should be reconsidered in multiagent environments. This work focuses on the stochastic cooperative environment. We apply a specific adaptation to one recently proposed weighted double estimator and propose a multiagent deep reinforcement learning framework, named Weig...

Expertness measuring in cooperative learning

Cooperative learning in a multi-agent system can improve the learning quality and learning speed. The improvement can be gained if each agent detects the expert agents and uses their knowledge properly. In this paper, a new cooperative learning method, called Weighted Strategy Sharing (WSS), is introduced. Some criteria are also introduced to measure the expertness of agents. In WSS, based on the...

Strategic Concept Formation of Consumer Goods Based on Knowledge Acquisition from Questionnaire Data

Product concept formation, which occurs in the early stage of product development, is critical to the successful development of a new product or to the suitable improvement of a current product. We propose a novel method for computer-aided strategic concept formation based on knowledge acquisition from questionnaire data. Product concept should be developed based on consumers’ needs that are u...

Publication date: 2009